Fault-tolerant design of the IBM pSeries 690 system using POWER4 processor technology
Authors
Abstract
The POWER4-based p690 systems offer the highest performance of the IBM eServer pSeries line of computers. Within the general-purpose UNIX server market, they also offer the highest levels of concurrent error detection, fault isolation, recovery, and availability. High availability is achieved by minimizing component failure rates through improvements in the base technology, and through design techniques that permit hard- and soft-failure detection, recovery, and isolation, repair deferral, and component replacement concurrent with system operation. In this paper, we discuss the fault-tolerant design techniques that were used for array, logic, storage, and I/O subsystems for the p690. We also present the diagnostic strategy, fault-isolation, and recovery techniques. New features such as POWER4 synchronous machine-check interrupt, PCI bus error recovery, array dynamic redundancy, and minimum-element dynamic reconfiguration are described. The design process used to verify error detection, fault isolation, and recovery is also described.
Similar resources
Ab Initio Quantum Chemistry on the IBM pSeries 690
In this study, we compare the performance of the POWER3 processor and the new IBM eServer pSeries 690. The pSeries 690 can scale up to 32-way POWER4 processor at 1.3 GHz and 1.1 GHz. To perform this comparison we used the Gaussian98 Revision A.11 series of electronic structure programs. It is an integrated system to model molecular systems under a variety of conditions, carrying out its calcula...
Some Practical Suggestions for Performing Gaussian Benchmarks on a pSeries 690 System
Gaussian98 is a connected series of programs from Gaussian, Inc., that can perform a variety of semi-empirical, ab initio, and density functional theory calculations. For more than 20 years, the Gaussian program has been extensively used at universities, and in the pharmaceutical and chemical industries to carry out basic research in the simulation and elucidation of new pharmaceuticals or mate...
Some Practical Suggestions for Performing NCBI BLAST Benchmarks on a pSeries™ 690 System
In this study, we present a series of benchmarks carried out on pSeries™ servers using BLAST (Basic Local Alignment Search Tool), one of the most popular and widely used programs for similarity searching. In particular, we compare the performance of the POWER3 series with the new pSeries 690 eServer. Scalability is measured as a function of the type of query (input fragment sequence) and type ...
Kevin Reick IBM TO ACHIEVE RELIABILITY GOALS
Fault-tolerant computing is a mature art whose techniques have migrated from mainframe computers to other product classes. This migration has involved tradeoffs between failure probabilities, defined availability requirements, performance implications, and product cost. At IBM, we have incorporated fault tolerance in designing Power4 systems—servers comprised of several Power4 chips. The Power4...
POWER4 system microarchitecture
The IBM POWER4 is a new microprocessor organized in a system structure that includes new technology to form systems. The name POWER4 as used in this context refers not only to a chip, but also to the structure used to interconnect chips to form systems. In this paper we describe the processor microarchitecture as well as the interconnection architecture employed to form systems up to a 32-way s...
Journal title:
- IBM Journal of Research and Development
Volume 46, Issue
Pages -
Publication date: 2002